Cluster-based information retrieval by using (K-means)-hierarchical parallel genetic algorithms approach

Abstract

Cluster-based information retrieval is one of the information retrieval (IR) tools that organize, extract features from, and categorize web documents according to their similarity. Unlike traditional approaches, cluster-based IR is fast at processing large document datasets. The aim is to improve the quality of retrieved documents, increase the efficiency of IR, and reduce the irrelevant documents returned for a user search. In this paper, we propose a (K-means)-hierarchical parallel genetic algorithms approach (HPGA) that combines the K-means clustering algorithm with a hybrid parallel genetic (PG) algorithm of the multi-deme and master/slave types. K-means is used to cluster the population into k subpopulations; the clusters most relevant to the query are then processed in a parallel way by the two levels of genetic parallelism. Irrelevant documents are thus not included in the subpopulations, which improves the quality of the results. Three common datasets (NLP, CISI, and CACM) are used to compute the recall, precision, and F-measure averages. Finally, we compared the precision values on the three datasets with Genetic-IR and classic-IR. The precision improvements over IR-GA were 45% on CACM, 27% on CISI, and 25% on NLP, while against Classic-IR, (K-means)-HPGA achieved 47% on CACM, 28% on CISI, and 34% on NLP.
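The clustering and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it leaves out the parallel genetic algorithm stage entirely, and the toy term-frequency vectors, the relevance judgments, the centroid seeding rule, and all function names are assumptions made for illustration. It shows K-means splitting a document collection into k groups, the selection of the cluster whose centroid is closest to the query by cosine similarity, and the computation of precision, recall, and F-measure on the retrieved cluster.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def kmeans(vectors, k, iterations=20):
    """Plain K-means; seeding with the first k vectors is an assumption."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to the nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def precision_recall_f(retrieved, relevant):
    """Precision, recall, and F-measure for one query."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f

# Toy term-frequency vectors over a hypothetical 3-term vocabulary.
docs = [
    [3, 0, 0],  # doc 0
    [0, 3, 1],  # doc 1
    [2, 1, 0],  # doc 2
    [0, 2, 2],  # doc 3
]
query = [1, 0, 0]
relevant = {0, 2}  # hypothetical relevance judgments for the query

labels, centroids = kmeans(docs, k=2)
# Keep only the cluster whose centroid is most similar to the query.
best = max(range(2), key=lambda c: cosine(query, centroids[c]))
retrieved = [i for i, lab in enumerate(labels) if lab == best]
precision, recall, f_measure = precision_recall_f(retrieved, relevant)
```

In the approach described above, the selected clusters would then serve as subpopulations for the genetic search rather than being returned directly; the sketch stops before that stage.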

Similar resources

Comparative Analysis of Parallel K Means and Parallel Fuzzy C Means Cluster Algorithms

In this paper, we give a short review of recent developments in clustering. Clustering is the process of grouping data, where the grouping is established by finding similarities between data based on their characteristics. Such groups are termed clusters. Clustering is a procedure for organizing objects into groups, based on the principle of maximizing the intra-c...

Effective Information Retrieval using Genetic Algorithms based Matching Functions Adaptation

Knowledge-intensive organizations have a vast array of information contained in large document repositories. With the advent of e-commerce and corporate intranets/extranets, these repositories are expected to grow at a fast pace. This explosive growth has led to huge, fragmented, and unstructured document collections. Although it has become easier to collect and store information in document coll...

Parallel Implementation of Genetic Algorithm using K-Means Clustering

The existing clustering algorithm executes sequentially over the data. Execution is slow, and processing even a single data item takes a long time. A new algorithm, Parallel Implementation of Genetic Algorithm using KMeans Clustering (PIGAKM), is...

Journal

Journal title: TELKOMNIKA Telecommunication Computing Electronics and Control

Year: 2021

ISSN: 1693-6930, 2302-9293

DOI: https://doi.org/10.12928/telkomnika.v19i1.16734